{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "85bb873e-562a-4d60-9536-b6b88aecb5c2",
   "metadata": {
    "tags": []
   },
   "source": [
    "# Working with GPUs in OpenVINO™\n",
    "\n",
    "\n",
    "#### Table of contents:\n",
    "\n",
    "- [Introduction](#Introduction)\n",
    "    - [Install required packages](#Install-required-packages)\n",
    "- [Checking GPUs with Query Device](#Checking-GPUs-with-Query-Device)\n",
    "    - [List GPUs with core.available_devices](#List-GPUs-with-core.available_devices)\n",
    "    - [Check Properties with core.get_property](#Check-Properties-with-core.get_property)\n",
    "    - [Brief Descriptions of Key Properties](#Brief-Descriptions-of-Key-Properties)\n",
    "- [Compiling a Model on GPU](#Compiling-a-Model-on-GPU)\n",
    "    - [Download a Model](#Download-a-Model)\n",
    "    - [Compile with Default Configuration](#Compile-with-Default-Configuration)\n",
    "    - [Reduce Compile Time through Model Caching](#Reduce-Compile-Time-through-Model-Caching)\n",
    "    - [Throughput and Latency Performance Hints](#Throughput-and-Latency-Performance-Hints)\n",
    "    - [Using Multiple GPUs with Multi-Device and Cumulative Throughput](#Using-Multiple-GPUs-with-Multi-Device-and-Cumulative-Throughput)\n",
    "- [Performance Comparison with benchmark_app](#Performance-Comparison-with-benchmark_app)\n",
    "    - [CPU vs GPU with Latency Hint](#CPU-vs-GPU-with-Latency-Hint)\n",
    "    - [CPU vs GPU with Throughput Hint](#CPU-vs-GPU-with-Throughput-Hint)\n",
    "    - [Single GPU vs Multiple GPUs](#Single-GPU-vs-Multiple-GPUs)\n",
    "- [Basic Application Using GPUs](#Basic-Application-Using-GPUs)\n",
    "    - [Import Necessary Packages](#Import-Necessary-Packages)\n",
    "    - [Compile the Model](#Compile-the-Model)\n",
    "    - [Load and Preprocess Video Frames](#Load-and-Preprocess-Video-Frames)\n",
    "    - [Define Model Output Classes](#Define-Model-Output-Classes)\n",
    "    - [Set up Asynchronous Pipeline](#Set-up-Asynchronous-Pipeline)\n",
    "        - [Callback Definition](#Callback-Definition)\n",
    "        - [Create Async Pipeline](#Create-Async-Pipeline)\n",
    "    - [Perform Inference](#Perform-Inference)\n",
    "    - [Process Results](#Process-Results)\n",
    "- [Conclusion](#Conclusion)\n",
    "\n",
    "\n",
    "### Installation Instructions\n",
    "\n",
    "This is a self-contained example that relies solely on its own code.\n",
    "\n",
    "We recommend running the notebook in a virtual environment. You only need a Jupyter server to start.\n",
    "For details, please refer to [Installation Guide](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/README.md#-installation-guide).\n",
    "\n",
    "<img referrerpolicy=\"no-referrer-when-downgrade\" src=\"https://static.scarf.sh/a.png?x-pxid=5b5a4db0-7875-4bfb-bdbd-01698b5b1a77&file=notebooks/gpu-device/gpu-device.ipynb\" />\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "d6c8277d-29b3-4172-8600-fcbbbbdbafd2",
   "metadata": {},
   "source": [
    "This tutorial provides a high-level overview of working with Intel GPUs in OpenVINO. It shows how to use Query Device to list system GPUs and check their properties, and it explains some of the key properties. It also shows how to compile a model on a GPU with performance hints and how to run inference on multiple GPUs with MULTI or CUMULATIVE_THROUGHPUT.\n",
    "\n",
    "The tutorial also shows example commands for benchmark_app that can be run to compare GPU performance in different configurations. It also provides the code for a basic end-to-end application that compiles a model on GPU and uses it to run inference."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "3e7253ed-d8a1-4475-8e83-263336639157",
   "metadata": {
    "tags": []
   },
   "source": [
    "## Introduction\n",
    "[back to top ⬆️](#Table-of-contents:)\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "ab1f9f13-63a4-4b2d-95e3-9a82c1e9cebd",
   "metadata": {},
   "source": [
    "Graphics processing units (GPUs) began as specialized chips developed to accelerate the rendering of computer graphics. In contrast to CPUs, which have a few powerful cores, GPUs have many more specialized cores, making them ideal for workloads that can be parallelized into simpler tasks. One such workload is deep learning, where GPUs can accelerate neural network inference by splitting operations across many cores.\n",
    "\n",
    "OpenVINO supports inference on Intel integrated GPUs (which are included with most [Intel® Core™ desktop and mobile processors](https://www.intel.com/content/www/us/en/products/details/processors/core.html)) or on Intel discrete GPU products like the [Intel® Arc™ A-Series Graphics cards](https://www.intel.com/content/www/us/en/products/details/discrete-gpus/arc.html) and [Intel® Data Center GPU Flex Series](https://www.intel.com/content/www/us/en/products/details/discrete-gpus/data-center-gpu/flex-series.html). To get started, first [install OpenVINO](https://docs.openvino.ai/2024/get-started/install-openvino.html) on a system equipped with one or more Intel GPUs. Follow the [GPU configuration instructions](https://docs.openvino.ai/2024/get-started/configurations/configurations-intel-gpu.html) to configure OpenVINO to work with your GPU. Then, read on to learn how to accelerate inference with GPUs in OpenVINO!"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "7a419978",
   "metadata": {},
   "source": [
    "### Install required packages\n",
    "[back to top ⬆️](#Table-of-contents:)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "d540e8ae",
   "metadata": {},
   "outputs": [],
   "source": [
    "%pip install -q \"openvino>=2024.4.0\" \"opencv-python\" \"tqdm\" \"huggingface_hub\""
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "d2f35f17-5209-43d6-b3b5-db35efb1f42e",
   "metadata": {
    "tags": []
   },
   "source": [
    "## Checking GPUs with Query Device\n",
    "[back to top ⬆️](#Table-of-contents:)\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "c84c37c2-b24b-4c9f-9ab8-d3a60b8393ec",
   "metadata": {},
   "source": [
    "In this section, we will see how to list the available GPUs and check their properties. Some of the key properties will also be defined."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "f0e91abc-6118-4532-9817-6ef44c51b0c2",
   "metadata": {
    "tags": []
   },
   "source": [
    "### List GPUs with core.available_devices\n",
    "[back to top ⬆️](#Table-of-contents:)\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "e76b6152-3915-415a-987b-c30935462007",
   "metadata": {},
   "source": [
    "OpenVINO Runtime provides the `available_devices` property for checking which devices are available for inference. The following code outputs a list of OpenVINO-compatible devices, in which Intel GPUs should appear."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "c8260bef-63c4-45f3-9ffd-4cc3ac892680",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-05-29T16:29:38.460335Z",
     "start_time": "2023-05-29T16:29:37.043277Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['CPU', 'GPU']"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import openvino as ov\n",
    "\n",
    "core = ov.Core()\n",
    "core.available_devices"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "a95aef31-bf6f-4d7e-85a5-97dc6bead645",
   "metadata": {},
   "source": [
    "Note that GPU devices are numbered starting at 0, and the integrated GPU always takes the id `0` if the system has one. For instance, if the system has a CPU, an integrated GPU, and a discrete GPU, we should expect to see a list like this: `['CPU', 'GPU.0', 'GPU.1']`. For convenience, \"GPU.0\" can also be addressed as just \"GPU\". For more details, see the [Device Naming Convention](https://docs.openvino.ai/2024/openvino-workflow/running-inference/inference-devices-and-modes/gpu-device.html#device-naming-convention) section."
   ]
  },
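  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "sketch-enumerate-gpus",
   "metadata": {},
   "source": [
    "As a minimal sketch of the naming convention above, the following snippet filters the device list down to its GPU entries and prints each one with its full name. The variable names are illustrative, and the output depends on the hardware in your system.\n",
    "\n",
    "```python\n",
    "import openvino as ov\n",
    "import openvino.properties as props\n",
    "\n",
    "core = ov.Core()\n",
    "\n",
    "# Keep only GPU entries such as \"GPU\", \"GPU.0\", or \"GPU.1\"\n",
    "gpu_devices = [name for name in core.available_devices if name.startswith(\"GPU\")]\n",
    "\n",
    "for name in gpu_devices:\n",
    "    print(f\"{name}: {core.get_property(name, props.device.full_name)}\")\n",
    "```\n",
    "\n",
    "If the list is empty, OpenVINO does not currently see any GPU, and nothing is printed."
   ]
  },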
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "4e266c79-7bb9-4907-9204-87b688087cbf",
   "metadata": {},
   "source": [
    "If the GPUs are installed correctly on the system and still do not appear in the list, follow the steps described [here](https://docs.openvino.ai/2024/get-started/configurations/configurations-intel-gpu.html) to configure your GPU drivers to work with OpenVINO. Once we have the GPUs working with OpenVINO, we can proceed with the next sections."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "c0491c07-0b6d-483f-a4d6-d3f7b5ec0c81",
   "metadata": {
    "tags": []
   },
   "source": [
    "### Check Properties with core.get_property\n",
    "[back to top ⬆️](#Table-of-contents:)\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "54399a50-31cf-48bd-bf98-6dea0c0b1b20",
   "metadata": {},
   "source": [
    "To get information about the GPUs, we can use device properties. In OpenVINO, devices have properties that describe their characteristics and configuration. Each property has a name and associated value that can be queried with the `get_property` method."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "bf7509b6-67e3-46fb-bb23-5f46c58b0fc1",
   "metadata": {},
   "source": [
    "To get the value of a property, such as the device name, we can use the `get_property` method as follows:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "eacdd5be-5d75-41d5-a51b-a376cb063b64",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-05-29T16:29:38.460645Z",
     "start_time": "2023-05-29T16:29:38.455963Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'Intel(R) Graphics [0x46a6] (iGPU)'"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import openvino.properties as props\n",
    "\n",
    "\n",
    "device = \"GPU\"\n",
    "\n",
    "core.get_property(device, props.device.full_name)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "aac3129a-129f-49aa-aba0-71ae1e892ada",
   "metadata": {},
   "source": [
    "Each device also has a property called `SUPPORTED_PROPERTIES`, which lists all the properties available on the device. We can check the value of each property by looping through the dictionary returned by `core.get_property(\"GPU\", props.supported_properties)` and querying each property in turn."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "cfb073e8-c997-40db-92fa-93334731285c",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-05-29T16:29:38.466335Z",
     "start_time": "2023-05-29T16:29:38.459390Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "GPU SUPPORTED_PROPERTIES:\n",
      "\n",
      "AVAILABLE_DEVICES             : ['0']\n",
      "RANGE_FOR_ASYNC_INFER_REQUESTS: (1, 2, 1)\n",
      "RANGE_FOR_STREAMS             : (1, 2)\n",
      "OPTIMAL_BATCH_SIZE            : 1\n",
      "MAX_BATCH_SIZE                : 1\n",
      "CACHING_PROPERTIES            : {'GPU_UARCH_VERSION': 'RO', 'GPU_EXECUTION_UNITS_COUNT': 'RO', 'GPU_DRIVER_VERSION': 'RO', 'GPU_DEVICE_ID': 'RO'}\n",
      "DEVICE_ARCHITECTURE           : GPU: v12.0.0\n",
      "FULL_DEVICE_NAME              : Intel(R) Graphics [0x46a6] (iGPU)\n",
      "DEVICE_UUID                   : UNSUPPORTED TYPE\n",
      "DEVICE_TYPE                   : Type.INTEGRATED\n",
      "DEVICE_GOPS                   : UNSUPPORTED TYPE\n",
      "OPTIMIZATION_CAPABILITIES     : ['FP32', 'BIN', 'FP16', 'INT8']\n",
      "GPU_DEVICE_TOTAL_MEM_SIZE     : UNSUPPORTED TYPE\n",
      "GPU_UARCH_VERSION             : 12.0.0\n",
      "GPU_EXECUTION_UNITS_COUNT     : 96\n",
      "GPU_MEMORY_STATISTICS         : UNSUPPORTED TYPE\n",
      "PERF_COUNT                    : False\n",
      "MODEL_PRIORITY                : Priority.MEDIUM\n",
      "GPU_HOST_TASK_PRIORITY        : Priority.MEDIUM\n",
      "GPU_QUEUE_PRIORITY            : Priority.MEDIUM\n",
      "GPU_QUEUE_THROTTLE            : Priority.MEDIUM\n",
      "GPU_ENABLE_LOOP_UNROLLING     : True\n",
      "CACHE_DIR                     : \n",
      "PERFORMANCE_HINT              : PerformanceMode.UNDEFINED\n",
      "COMPILATION_NUM_THREADS       : 20\n",
      "NUM_STREAMS                   : 1\n",
      "PERFORMANCE_HINT_NUM_REQUESTS : 0\n",
      "INFERENCE_PRECISION_HINT      : <Type: 'undefined'>\n",
      "DEVICE_ID                     : 0\n"
     ]
    }
   ],
   "source": [
    "print(f\"{device} SUPPORTED_PROPERTIES:\\n\")\n",
    "supported_properties = core.get_property(device, props.supported_properties)\n",
    "indent = len(max(supported_properties, key=len))\n",
    "\n",
    "for property_key in supported_properties:\n",
    "    if property_key not in (\n",
    "        \"SUPPORTED_METRICS\",\n",
    "        \"SUPPORTED_CONFIG_KEYS\",\n",
    "        \"SUPPORTED_PROPERTIES\",\n",
    "    ):\n",
    "        try:\n",
    "            property_val = core.get_property(device, property_key)\n",
    "        except TypeError:\n",
    "            property_val = \"UNSUPPORTED TYPE\"\n",
    "        print(f\"{property_key:<{indent}}: {property_val}\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "88546b71-2ae8-4519-b9b7-a54f444e204f",
   "metadata": {
    "tags": []
   },
   "source": [
    "### Brief Descriptions of Key Properties\n",
    "[back to top ⬆️](#Table-of-contents:)\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "6524096f-cc77-4dca-9e52-e5a3c2c0523b",
   "metadata": {},
   "source": [
    "Each device has several properties as seen in the last command. Some of the key properties are:\n",
    "\n",
    "* `FULL_DEVICE_NAME` - The product name of the GPU and whether it is an integrated or discrete GPU (iGPU or dGPU).\n",
    "* `OPTIMIZATION_CAPABILITIES` - The model data types (INT8, FP16, FP32, etc.) that are supported by this GPU.\n",
    "* `GPU_EXECUTION_UNITS_COUNT` - The number of execution units available in the GPU's architecture, which is a relative measure of the GPU's processing power.\n",
    "* `RANGE_FOR_STREAMS` - The number of processing streams available on the GPU that can be used to execute parallel inference requests. When compiling a model in LATENCY or THROUGHPUT mode, OpenVINO will automatically select the best number of streams for low latency or high throughput.\n",
    "* `PERFORMANCE_HINT` - A high-level way to tune the device for a specific performance metric, such as latency or throughput, without worrying about device-specific settings.\n",
    "* `CACHE_DIR` - The directory where the model cache data is stored to speed up compilation time.\n",
    "\n",
    "\n",
    "To learn more about devices and properties, see the [Query Device Properties](https://docs.openvino.ai/2024/openvino-workflow/running-inference/inference-devices-and-modes/query-device-properties.html) page."
   ]
  },
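  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "sketch-key-properties",
   "metadata": {},
   "source": [
    "To look at only a few of these properties rather than the full list, one option is to query them by name. This is a small sketch that assumes at least one Intel GPU is visible to OpenVINO. The property names are passed as plain strings, in the same way as in the loop shown earlier.\n",
    "\n",
    "```python\n",
    "import openvino as ov\n",
    "\n",
    "core = ov.Core()\n",
    "device = \"GPU\"  # assumption: a GPU device is available\n",
    "\n",
    "key_properties = (\n",
    "    \"FULL_DEVICE_NAME\",\n",
    "    \"OPTIMIZATION_CAPABILITIES\",\n",
    "    \"GPU_EXECUTION_UNITS_COUNT\",\n",
    "    \"RANGE_FOR_STREAMS\",\n",
    ")\n",
    "\n",
    "# Print each selected property next to its current value\n",
    "for key in key_properties:\n",
    "    print(f\"{key}: {core.get_property(device, key)}\")\n",
    "```"
   ]
  },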
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "131bd0f9-9b1e-4fc1-ad09-3960e1d6c50f",
   "metadata": {
    "tags": []
   },
   "source": [
    "## Compiling a Model on GPU\n",
    "[back to top ⬆️](#Table-of-contents:)\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "cea15935-2373-4029-8bc0-998558a6defe",
   "metadata": {},
   "source": [
    "Now we know how to list the GPUs in the system and check their properties. We can use one of them to compile and run models with the OpenVINO [GPU plugin](https://docs.openvino.ai/2024/openvino-workflow/running-inference/inference-devices-and-modes/gpu-device.html)."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "db4600db-faea-49cc-b9b9-93415223a38f",
   "metadata": {},
   "source": [
    "### Download a Model\n",
    "[back to top ⬆️](#Table-of-contents:)\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "821e29d1-6021-4552-a926-cb35bd6af776",
   "metadata": {},
   "source": [
    "This tutorial uses the `ssdlite_mobilenet_v2` model, which performs object detection. The model was trained on the version of the [Common Objects in Context (COCO)](https://cocodataset.org/#home) dataset with 91 object categories. For details, see the [paper](https://arxiv.org/abs/1801.04381)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "061a741f-2fc7-430e-b2f7-21d229533447",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-05-29T16:29:43.220943Z",
     "start_time": "2023-05-29T16:29:38.466624Z"
    }
   },
   "outputs": [],
   "source": [
    "import huggingface_hub as hf_hub\n",
    "from pathlib import Path\n",
    "\n",
    "# Fetch `notebook_utils` module\n",
    "import requests\n",
    "\n",
    "if not Path(\"notebook_utils.py\").exists():\n",
    "    r = requests.get(\n",
    "        url=\"https://raw.githubusercontent.com/openvinotoolkit/openvino_notebooks/latest/utils/notebook_utils.py\",\n",
    "    )\n",
    "\n",
    "    with open(\"notebook_utils.py\", \"w\") as f:\n",
    "        f.write(r.text)\n",
    "\n",
    "# A directory where the model will be downloaded.\n",
    "base_model_dir = Path(\"./model\").expanduser()\n",
    "\n",
    "model_name = \"ssdlite_mobilenet_v2_fp16\"\n",
    "\n",
    "ov_model_path = base_model_dir / model_name / f\"{model_name}.xml\"\n",
    "\n",
    "if not (ov_model_path).exists():\n",
    "    hf_hub.snapshot_download(\"katuni4ka/ssdlite_mobilenet_v2_fp16\", local_dir=base_model_dir / model_name)\n",
    "\n",
    "model = core.read_model(ov_model_path)\n",
    "\n",
    "# Read more about telemetry collection at https://github.com/openvinotoolkit/openvino_notebooks?tab=readme-ov-file#-telemetry\n",
    "from notebook_utils import collect_telemetry\n",
    "\n",
    "collect_telemetry(\"gpu-device.ipynb\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "4cc043aa-8729-45bc-9e98-6b3daea9f271",
   "metadata": {
    "tags": []
   },
   "source": [
    "### Compile with Default Configuration\n",
    "[back to top ⬆️](#Table-of-contents:)\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "6999d5e8-3429-4e2b-a845-c5569ec1895f",
   "metadata": {},
   "source": [
    "The model was already read with the `read_model` method in the previous cell. Now we can call the `compile_model` method and specify the name of the device we want to compile the model on, in this case, \"GPU\"."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "57f7d00e-ebe5-456d-8643-112e9d8c860d",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-05-29T16:30:08.223754Z",
     "start_time": "2023-05-29T16:30:06.427679Z"
    }
   },
   "outputs": [],
   "source": [
    "compiled_model = core.compile_model(model, device)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "18c58745-69c3-4034-b221-bc4cf8a5b27b",
   "metadata": {},
   "source": [
    "If you have multiple GPUs in the system, you can specify which one to use with \"GPU.0\", \"GPU.1\", etc. Any of the device names returned by the `available_devices` property is a valid device specifier. You may also use \"AUTO\", which automatically selects the best device for inference (often the GPU). To learn more about the AUTO plugin, visit the [Automatic Device Selection](https://docs.openvino.ai/2024/openvino-workflow/running-inference/inference-devices-and-modes/auto-device-selection.html) page as well as the [AUTO device tutorial](../auto-device/auto-device.ipynb)."
   ]
  },
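  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "sketch-auto-device",
   "metadata": {},
   "source": [
    "For illustration, compiling with \"AUTO\" might look like the sketch below. It reuses `ov_model_path` from the download cell. The name `compiled_model_auto` is only an example. The snippet also queries the `EXECUTION_DEVICES` property to see where AUTO is running the model. The value reported right after compilation may differ from the final device, since AUTO can start on one device while another is still being prepared.\n",
    "\n",
    "```python\n",
    "import openvino as ov\n",
    "\n",
    "core = ov.Core()\n",
    "model = core.read_model(ov_model_path)  # `ov_model_path` comes from the download cell\n",
    "\n",
    "# Let AUTO choose the device instead of naming one explicitly\n",
    "compiled_model_auto = core.compile_model(model, \"AUTO\")\n",
    "\n",
    "# Ask which device(s) AUTO is currently using for this model\n",
    "print(compiled_model_auto.get_property(\"EXECUTION_DEVICES\"))\n",
    "```"
   ]
  },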
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "2747c592-29f1-49e9-9ed4-5bf1a39e5c8a",
   "metadata": {},
   "source": [
    "### Reduce Compile Time through Model Caching\n",
    "[back to top ⬆️](#Table-of-contents:)\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "22069d88-1d29-4978-ba32-2613f9d2143d",
   "metadata": {},
   "source": [
    "Depending on the model, device-specific optimizations and network compilation can make the compile step time-consuming, especially for larger models, which may degrade the user experience of the application that uses them. To address this, OpenVINO can cache a model once it is compiled on a supported device and reuse it in later `compile_model` calls; all that is needed is to set a cache folder beforehand. For instance, to cache the same model we compiled above, we can do the following:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "20ee3bba-94e6-4a82-9c79-31d86a89ece4",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-05-29T16:30:09.876198Z",
     "start_time": "2023-05-29T16:30:08.223528Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Cache enabled (first time) - compile time: 1.692436695098877s\n"
     ]
    }
   ],
   "source": [
    "import time\n",
    "from pathlib import Path\n",
    "\n",
    "# Create cache folder\n",
    "cache_folder = Path(\"cache\")\n",
    "cache_folder.mkdir(exist_ok=True)\n",
    "\n",
    "start = time.time()\n",
    "core = ov.Core()\n",
    "\n",
    "# Set cache folder\n",
    "core.set_property({props.cache_dir(): cache_folder})\n",
    "\n",
    "# Compile the model as before\n",
    "model = core.read_model(ov_model_path)\n",
    "compiled_model = core.compile_model(model, device)\n",
    "print(f\"Cache enabled (first time) - compile time: {time.time() - start}s\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "f7915734-8138-43bc-9334-8c81a919fa26",
   "metadata": {},
   "source": [
    "To get an idea of the effect that caching can have, we can measure the compile times with caching enabled and disabled as follows:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "8ce447e8-19a5-4ead-9710-195da9f58640",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-05-29T16:30:12.132441Z",
     "start_time": "2023-05-29T16:30:09.878527Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Cache enabled  - compile time: 0.26888394355773926s\n",
      "Cache disabled - compile time: 1.982884168624878s\n"
     ]
    }
   ],
   "source": [
    "start = time.time()\n",
    "core = ov.Core()\n",
    "core.set_property({props.cache_dir(): \"cache\"})\n",
    "model = core.read_model(model=ov_model_path)\n",
    "compiled_model = core.compile_model(model, device)\n",
    "print(f\"Cache enabled  - compile time: {time.time() - start}s\")\n",
    "\n",
    "start = time.time()\n",
    "core = ov.Core()\n",
    "model = core.read_model(ov_model_path)\n",
    "compiled_model = core.compile_model(model, device)\n",
    "print(f\"Cache disabled - compile time: {time.time() - start}s\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "7328633a-ba7d-422d-9f62-446ca7db142e",
   "metadata": {},
   "source": [
    "The actual time savings depend on the environment and on the model being used, but caching is worth considering when optimizing an application. To read more about this, see the [Model Caching](https://docs.openvino.ai/2024/openvino-workflow/running-inference/optimize-inference/optimizing-latency/model-caching-overview.html) docs."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "2aa916d7-4626-4f54-9b34-954eca19aafb",
   "metadata": {
    "tags": []
   },
   "source": [
    "### Throughput and Latency Performance Hints\n",
    "[back to top ⬆️](#Table-of-contents:)\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "a77cbc96-8832-4ed9-9109-30b0efce9da9",
   "metadata": {},
   "source": [
    "To simplify device and pipeline configuration, OpenVINO provides high-level performance hints that automatically set the batch size and number of parallel threads to use for inference. The \"LATENCY\" performance hint optimizes for fast inference times, while the \"THROUGHPUT\" performance hint optimizes for high overall throughput, such as frames per second (FPS)."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "7077b662-22f3-4c52-9c80-e5ac1309c482",
   "metadata": {},
   "source": [
    "To use the \"LATENCY\" performance hint, add `{hints.performance_mode(): hints.PerformanceMode.LATENCY}` when compiling the model as shown below. For GPUs, this automatically minimizes the batch size and number of parallel streams such that all of the compute resources can focus on completing a single inference as fast as possible."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "4465a003-03ff-497f-94b5-3a194b6e386f",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-05-29T16:30:14.212280Z",
     "start_time": "2023-05-29T16:30:12.131948Z"
    }
   },
   "outputs": [],
   "source": [
    "import openvino.properties.hint as hints\n",
    "\n",
    "\n",
    "compiled_model = core.compile_model(model, device, {hints.performance_mode(): hints.PerformanceMode.LATENCY})"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "06589f38-ce35-457f-8395-a4a3f6327ea0",
   "metadata": {},
   "source": [
    "To use the \"THROUGHPUT\" performance hint, add `{hints.performance_mode(): hints.PerformanceMode.THROUGHPUT}` when compiling the model. For GPUs, this creates multiple processing streams to efficiently utilize all the execution cores and optimizes the batch size to fill the available memory."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "65fd2215-00d7-4838-aa76-040fcee18a76",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-05-29T16:30:16.335240Z",
     "start_time": "2023-05-29T16:30:14.213700Z"
    }
   },
   "outputs": [],
   "source": [
    "compiled_model = core.compile_model(model, device, {hints.performance_mode(): hints.PerformanceMode.THROUGHPUT})"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "b0a98815-669e-4868-a5f3-7104d6887fb3",
   "metadata": {
    "tags": []
   },
   "source": [
    "### Using Multiple GPUs with Multi-Device and Cumulative Throughput\n",
    "[back to top ⬆️](#Table-of-contents:)\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "0da18381-edfc-4d3b-942a-87d6b0cbd5d3",
   "metadata": {},
   "source": [
    "The latency and throughput hints mentioned above can make a real difference when used appropriately, but they usually use just one device, either because the [AUTO plugin](https://docs.openvino.ai/2024/openvino-workflow/running-inference/inference-devices-and-modes/auto-device-selection.html#how-auto-works) selects one or because the device name is specified manually as above. When we have multiple devices, such as an integrated GPU and a discrete GPU, we may use both at the same time to improve resource utilization. To do this, OpenVINO provides a virtual device called [MULTI](https://docs.openvino.ai/2024/openvino-workflow/running-inference/inference-devices-and-modes/multi-device.html), which is a combination of the existing devices that knows how to split inference work between them, leveraging the capabilities of each device.\n",
    "\n",
    "As an example, if we want to use both integrated and discrete GPUs and the CPU at the same time, we can compile the model as follows:\n",
    "\n",
    "```python\n",
    "compiled_model = core.compile_model(model=model, device_name=\"MULTI:GPU.1,GPU.0,CPU\")\n",
    "```\n",
    "\n",
    "Note that we always need to explicitly specify the device list for MULTI to work; otherwise, MULTI does not know which devices are available for inference. However, this is not the only way to use multiple devices in OpenVINO. There is another performance hint called \"CUMULATIVE_THROUGHPUT\" that works similarly to MULTI, except that it uses the devices automatically selected by AUTO. This way, we do not need to manually specify the devices to use. Below is an example showing how to use \"CUMULATIVE_THROUGHPUT\", equivalent to the MULTI one:\n",
    "\n",
    "```python\n",
    "compiled_model = core.compile_model(model=model, device_name=\"AUTO\", config={hints.performance_mode(): hints.PerformanceMode.CUMULATIVE_THROUGHPUT})\n",
    "```\n",
    "\n",
    "> **Important**: **The “THROUGHPUT”, “MULTI”, and “CUMULATIVE_THROUGHPUT” modes are only applicable to asynchronous inferencing pipelines. The example at the end of this article shows how to set up an asynchronous pipeline that takes advantage of parallelism to increase throughput.** To learn more, see [Asynchronous Inferencing](https://docs.openvino.ai/2024/documentation/openvino-extensibility/openvino-plugin-library/asynch-inference-request.html) in OpenVINO as well as the [Asynchronous Inference notebook](../async-api/async-api.ipynb)."
   ]
  },
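  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "sketch-multi-device-string",
   "metadata": {},
   "source": [
    "Since MULTI needs an explicit device list, one possible approach is to build that list from the devices OpenVINO reports instead of hard-coding it. The sketch below is an illustration only. It assumes a CPU is available alongside the GPUs, reuses `ov_model_path` from the download cell, and uses hypothetical names (`gpus`, `multi_device`, `compiled_model_multi`).\n",
    "\n",
    "```python\n",
    "import openvino as ov\n",
    "\n",
    "core = ov.Core()\n",
    "model = core.read_model(ov_model_path)  # `ov_model_path` comes from the download cell\n",
    "\n",
    "# Collect the GPU entries; reverse sorting puts higher-numbered GPUs first,\n",
    "# which are the discrete ones when an integrated GPU holds id 0.\n",
    "gpus = sorted((d for d in core.available_devices if d.startswith(\"GPU\")), reverse=True)\n",
    "\n",
    "# Produces a string such as \"MULTI:GPU.1,GPU.0,CPU\" (assumes a CPU is present)\n",
    "multi_device = \"MULTI:\" + \",\".join(gpus + [\"CPU\"])\n",
    "print(multi_device)\n",
    "\n",
    "compiled_model_multi = core.compile_model(model=model, device_name=multi_device)\n",
    "```\n",
    "\n",
    "On a system with a single GPU, the resulting string would be `MULTI:GPU,CPU`."
   ]
  },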
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "99351910-c54c-4f55-b1ce-eb74125a9dbd",
   "metadata": {
    "tags": []
   },
   "source": [
    "## Performance Comparison with benchmark_app\n",
    "[back to top ⬆️](#Table-of-contents:)\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "93b411e5-5c56-462a-82e3-22ba020f64c1",
   "metadata": {},
   "source": [
    "Given all the different options available when compiling a model, it may be difficult to know which settings work best for a certain application. Thankfully, OpenVINO provides `benchmark_app`, a performance benchmarking tool."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "f5921100-0f41-42d3-8f90-703c20cb11bd",
   "metadata": {},
   "source": [
    "The basic syntax of `benchmark_app` is as follows:\n",
    "\n",
    "```\n",
    "benchmark_app -m PATH_TO_MODEL -d TARGET_DEVICE -hint {throughput,cumulative_throughput,latency,none}\n",
    "```\n",
    "\n",
    "where `TARGET_DEVICE` is any device listed by the `available_devices` property, or one of the MULTI and AUTO devices we saw previously, and the value of `-hint` is one of the values between the braces.\n",
    "\n",
    "Note that `benchmark_app` only requires the model path to run, but both the device and hint arguments will be useful to us. For more advanced usage, the tool has other options that can be checked by running `benchmark_app -h` or reading the [docs](https://docs.openvino.ai/2024/learn-openvino/openvino-samples/benchmark-tool.html). The following example shows how to benchmark a simple model, using a GPU with a latency focus:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "74c76637-f86c-41d5-8846-6e3946126084",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-05-29T16:31:19.815223Z",
     "start_time": "2023-05-29T16:30:16.335801Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[Step 1/11] Parsing and validating input arguments\n",
      "[ INFO ] Parsing input parameters\n",
      "[Step 2/11] Loading OpenVINO Runtime\n",
      "[ INFO ] OpenVINO:\n",
      "[ INFO ] Build ................................. 2022.3.0-9052-9752fafe8eb-releases/2022/3\n",
      "[ INFO ] \n",
      "[ INFO ] Device info:\n",
      "[ INFO ] GPU\n",
      "[ INFO ] Build ................................. 2022.3.0-9052-9752fafe8eb-releases/2022/3\n",
      "[ INFO ] \n",
      "[ INFO ] \n",
      "[Step 3/11] Setting device configuration\n",
      "[Step 4/11] Reading model files\n",
      "[ INFO ] Loading model files\n",
      "[ INFO ] Read model took 14.02 ms\n",
      "[ INFO ] Original model I/O parameters:\n",
      "[ INFO ] Model inputs:\n",
      "[ INFO ]     image_tensor , image_tensor:0 (node: image_tensor) : u8 / [N,H,W,C] / [1,300,300,3]\n",
      "[ INFO ] Model outputs:\n",
      "[ INFO ]     detection_boxes:0 (node: DetectionOutput) : f32 / [...] / [1,1,100,7]\n",
      "[Step 5/11] Resizing model to match image sizes and given batch\n",
      "[ INFO ] Model batch size: 1\n",
      "[Step 6/11] Configuring input of the model\n",
      "[ INFO ] Model inputs:\n",
      "[ INFO ]     image_tensor , image_tensor:0 (node: image_tensor) : u8 / [N,H,W,C] / [1,300,300,3]\n",
      "[ INFO ] Model outputs:\n",
      "[ INFO ]     detection_boxes:0 (node: DetectionOutput) : f32 / [...] / [1,1,100,7]\n",
      "[Step 7/11] Loading the model to the device\n",
      "[ INFO ] Compile model took 1932.50 ms\n",
      "[Step 8/11] Querying optimal runtime parameters\n",
      "[ INFO ] Model:\n",
      "[ INFO ]   NETWORK_NAME: frozen_inference_graph\n",
      "[ INFO ]   OPTIMAL_NUMBER_OF_INFER_REQUESTS: 1\n",
      "[ INFO ]   PERF_COUNT: False\n",
      "[ INFO ]   MODEL_PRIORITY: Priority.MEDIUM\n",
      "[ INFO ]   GPU_HOST_TASK_PRIORITY: Priority.MEDIUM\n",
      "[ INFO ]   GPU_QUEUE_PRIORITY: Priority.MEDIUM\n",
      "[ INFO ]   GPU_QUEUE_THROTTLE: Priority.MEDIUM\n",
      "[ INFO ]   GPU_ENABLE_LOOP_UNROLLING: True\n",
      "[ INFO ]   CACHE_DIR: \n",
      "[ INFO ]   PERFORMANCE_HINT: PerformanceMode.LATENCY\n",
      "[ INFO ]   COMPILATION_NUM_THREADS: 20\n",
      "[ INFO ]   NUM_STREAMS: 1\n",
      "[ INFO ]   PERFORMANCE_HINT_NUM_REQUESTS: 0\n",
      "[ INFO ]   INFERENCE_PRECISION_HINT: <Type: 'undefined'>\n",
      "[ INFO ]   DEVICE_ID: 0\n",
      "[Step 9/11] Creating infer requests and preparing input tensors\n",
      "[ WARNING ] No input files were given for input 'image_tensor'!. This input will be filled with random values!\n",
      "[ INFO ] Fill input 'image_tensor' with random values \n",
      "[Step 10/11] Measuring performance (Start inference asynchronously, 1 inference requests, limits: 60000 ms duration)\n",
      "[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).\n",
      "[ INFO ] First inference took 6.17 ms\n",
      "[Step 11/11] Dumping statistics report\n",
      "[ INFO ] Count:            12710 iterations\n",
      "[ INFO ] Duration:         60006.58 ms\n",
      "[ INFO ] Latency:\n",
      "[ INFO ]    Median:        4.52 ms\n",
      "[ INFO ]    Average:       4.57 ms\n",
      "[ INFO ]    Min:           3.13 ms\n",
      "[ INFO ]    Max:           17.62 ms\n",
      "[ INFO ] Throughput:   211.81 FPS\n"
     ]
    }
   ],
   "source": [
    "!benchmark_app -m {ov_model_path} -d GPU -hint latency"
   ]
  },
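  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "3b7e2c1a-5d4f-4e8a-9c0b-7a1d2e3f4a5b",
   "metadata": {},
   "source": [
    "The last step of the report holds the figures that matter when comparing runs: the latency statistics and the throughput. As a rough sketch, assuming the report layout shown in the output above (it may differ between OpenVINO versions), these figures could be collected programmatically. `parse_benchmark_report` and `run_benchmark` are hypothetical helpers written for this illustration and are not part of OpenVINO.\n",
    "\n",
    "```python\n",
    "import subprocess\n",
    "\n",
    "\n",
    "def parse_benchmark_report(report):\n",
    "    # Collect the summary figures printed in the final step of the report.\n",
    "    stats = {}\n",
    "    for line in report.splitlines():\n",
    "        for key in ('Median', 'Average', 'Min', 'Max', 'Throughput'):\n",
    "            marker = key + ':'\n",
    "            if marker in line:\n",
    "                # The value is the first token after the marker, e.g. '4.52 ms'.\n",
    "                stats[key.lower()] = float(line.split(marker, 1)[1].split()[0])\n",
    "    return stats\n",
    "\n",
    "\n",
    "def run_benchmark(model_path, device, hint):\n",
    "    # Uses only the flags introduced above: -m, -d and -hint.\n",
    "    cmd = ['benchmark_app', '-m', str(model_path), '-d', device, '-hint', hint]\n",
    "    completed = subprocess.run(cmd, capture_output=True, text=True, check=True)\n",
    "    return parse_benchmark_report(completed.stdout)\n",
    "\n",
    "\n",
    "# Summary lines copied from the recorded GPU run above.\n",
    "sample = '''[ INFO ] Latency:\n",
    "[ INFO ]    Median:        4.52 ms\n",
    "[ INFO ]    Average:       4.57 ms\n",
    "[ INFO ]    Min:           3.13 ms\n",
    "[ INFO ]    Max:           17.62 ms\n",
    "[ INFO ] Throughput:   211.81 FPS'''\n",
    "print(parse_benchmark_report(sample))\n",
    "```\n",
    "\n",
    "With `benchmark_app` installed, calling `run_benchmark` with a model path, a device name and a hint would return the same kind of dictionary for a live run."
   ]
  },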
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "4c27f62b-6053-4a8a-a554-064b04fa2ac7",
   "metadata": {},
   "source": [
    "For completeness, the following cells list some of the comparisons we may want to run by varying the device and hint. Actual performance depends on the hardware and on the model. A GPU often outperforms the CPU, and multiple GPUs can outperform a single GPU as long as there is enough work for each of them, but neither is guaranteed: in the outputs recorded below, the CPU reaches a lower median latency (3.87 ms vs 4.57 ms) and a higher throughput (420.59 FPS vs 326.34 FPS) than the GPU for this model."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "9d8a5020-9e9c-48ae-8386-8078e2bf08a2",
   "metadata": {},
   "source": [
    "#### CPU vs GPU with Latency Hint\n",
    "[back to top ⬆️](#Table-of-contents:)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "81f487ea-da76-4d6f-9670-9204965ded67",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-05-29T16:32:20.310812Z",
     "start_time": "2023-05-29T16:31:19.816075Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[Step 1/11] Parsing and validating input arguments\n",
      "[ INFO ] Parsing input parameters\n",
      "[Step 2/11] Loading OpenVINO Runtime\n",
      "[ INFO ] OpenVINO:\n",
      "[ INFO ] Build ................................. 2022.3.0-9052-9752fafe8eb-releases/2022/3\n",
      "[ INFO ] \n",
      "[ INFO ] Device info:\n",
      "[ INFO ] CPU\n",
      "[ INFO ] Build ................................. 2022.3.0-9052-9752fafe8eb-releases/2022/3\n",
      "[ INFO ] \n",
      "[ INFO ] \n",
      "[Step 3/11] Setting device configuration\n",
      "[Step 4/11] Reading model files\n",
      "[ INFO ] Loading model files\n",
      "[ INFO ] Read model took 30.38 ms\n",
      "[ INFO ] Original model I/O parameters:\n",
      "[ INFO ] Model inputs:\n",
      "[ INFO ]     image_tensor , image_tensor:0 (node: image_tensor) : u8 / [N,H,W,C] / [1,300,300,3]\n",
      "[ INFO ] Model outputs:\n",
      "[ INFO ]     detection_boxes:0 (node: DetectionOutput) : f32 / [...] / [1,1,100,7]\n",
      "[Step 5/11] Resizing model to match image sizes and given batch\n",
      "[ INFO ] Model batch size: 1\n",
      "[Step 6/11] Configuring input of the model\n",
      "[ INFO ] Model inputs:\n",
      "[ INFO ]     image_tensor , image_tensor:0 (node: image_tensor) : u8 / [N,H,W,C] / [1,300,300,3]\n",
      "[ INFO ] Model outputs:\n",
      "[ INFO ]     detection_boxes:0 (node: DetectionOutput) : f32 / [...] / [1,1,100,7]\n",
      "[Step 7/11] Loading the model to the device\n",
      "[ INFO ] Compile model took 127.72 ms\n",
      "[Step 8/11] Querying optimal runtime parameters\n",
      "[ INFO ] Model:\n",
      "[ INFO ]   NETWORK_NAME: frozen_inference_graph\n",
      "[ INFO ]   OPTIMAL_NUMBER_OF_INFER_REQUESTS: 1\n",
      "[ INFO ]   NUM_STREAMS: 1\n",
      "[ INFO ]   AFFINITY: Affinity.CORE\n",
      "[ INFO ]   INFERENCE_NUM_THREADS: 14\n",
      "[ INFO ]   PERF_COUNT: False\n",
      "[ INFO ]   INFERENCE_PRECISION_HINT: <Type: 'float32'>\n",
      "[ INFO ]   PERFORMANCE_HINT: PerformanceMode.LATENCY\n",
      "[ INFO ]   PERFORMANCE_HINT_NUM_REQUESTS: 0\n",
      "[Step 9/11] Creating infer requests and preparing input tensors\n",
      "[ WARNING ] No input files were given for input 'image_tensor'!. This input will be filled with random values!\n",
      "[ INFO ] Fill input 'image_tensor' with random values \n",
      "[Step 10/11] Measuring performance (Start inference asynchronously, 1 inference requests, limits: 60000 ms duration)\n",
      "[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).\n",
      "[ INFO ] First inference took 4.42 ms\n",
      "[Step 11/11] Dumping statistics report\n",
      "[ INFO ] Count:            15304 iterations\n",
      "[ INFO ] Duration:         60005.72 ms\n",
      "[ INFO ] Latency:\n",
      "[ INFO ]    Median:        3.87 ms\n",
      "[ INFO ]    Average:       3.88 ms\n",
      "[ INFO ]    Min:           3.49 ms\n",
      "[ INFO ]    Max:           5.95 ms\n",
      "[ INFO ] Throughput:   255.04 FPS\n"
     ]
    }
   ],
   "source": [
    "!benchmark_app -m {ov_model_path} -d CPU -hint latency"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "350075ea-3c4b-467d-bd24-808e51745fcd",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-05-29T16:33:24.123100Z",
     "start_time": "2023-05-29T16:32:20.313617Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[Step 1/11] Parsing and validating input arguments\n",
      "[ INFO ] Parsing input parameters\n",
      "[Step 2/11] Loading OpenVINO Runtime\n",
      "[ INFO ] OpenVINO:\n",
      "[ INFO ] Build ................................. 2022.3.0-9052-9752fafe8eb-releases/2022/3\n",
      "[ INFO ] \n",
      "[ INFO ] Device info:\n",
      "[ INFO ] GPU\n",
      "[ INFO ] Build ................................. 2022.3.0-9052-9752fafe8eb-releases/2022/3\n",
      "[ INFO ] \n",
      "[ INFO ] \n",
      "[Step 3/11] Setting device configuration\n",
      "[Step 4/11] Reading model files\n",
      "[ INFO ] Loading model files\n",
      "[ INFO ] Read model took 14.65 ms\n",
      "[ INFO ] Original model I/O parameters:\n",
      "[ INFO ] Model inputs:\n",
      "[ INFO ]     image_tensor , image_tensor:0 (node: image_tensor) : u8 / [N,H,W,C] / [1,300,300,3]\n",
      "[ INFO ] Model outputs:\n",
      "[ INFO ]     detection_boxes:0 (node: DetectionOutput) : f32 / [...] / [1,1,100,7]\n",
      "[Step 5/11] Resizing model to match image sizes and given batch\n",
      "[ INFO ] Model batch size: 1\n",
      "[Step 6/11] Configuring input of the model\n",
      "[ INFO ] Model inputs:\n",
      "[ INFO ]     image_tensor , image_tensor:0 (node: image_tensor) : u8 / [N,H,W,C] / [1,300,300,3]\n",
      "[ INFO ] Model outputs:\n",
      "[ INFO ]     detection_boxes:0 (node: DetectionOutput) : f32 / [...] / [1,1,100,7]\n",
      "[Step 7/11] Loading the model to the device\n",
      "[ INFO ] Compile model took 2254.81 ms\n",
      "[Step 8/11] Querying optimal runtime parameters\n",
      "[ INFO ] Model:\n",
      "[ INFO ]   NETWORK_NAME: frozen_inference_graph\n",
      "[ INFO ]   OPTIMAL_NUMBER_OF_INFER_REQUESTS: 1\n",
      "[ INFO ]   PERF_COUNT: False\n",
      "[ INFO ]   MODEL_PRIORITY: Priority.MEDIUM\n",
      "[ INFO ]   GPU_HOST_TASK_PRIORITY: Priority.MEDIUM\n",
      "[ INFO ]   GPU_QUEUE_PRIORITY: Priority.MEDIUM\n",
      "[ INFO ]   GPU_QUEUE_THROTTLE: Priority.MEDIUM\n",
      "[ INFO ]   GPU_ENABLE_LOOP_UNROLLING: True\n",
      "[ INFO ]   CACHE_DIR: \n",
      "[ INFO ]   PERFORMANCE_HINT: PerformanceMode.LATENCY\n",
      "[ INFO ]   COMPILATION_NUM_THREADS: 20\n",
      "[ INFO ]   NUM_STREAMS: 1\n",
      "[ INFO ]   PERFORMANCE_HINT_NUM_REQUESTS: 0\n",
      "[ INFO ]   INFERENCE_PRECISION_HINT: <Type: 'undefined'>\n",
      "[ INFO ]   DEVICE_ID: 0\n",
      "[Step 9/11] Creating infer requests and preparing input tensors\n",
      "[ WARNING ] No input files were given for input 'image_tensor'!. This input will be filled with random values!\n",
      "[ INFO ] Fill input 'image_tensor' with random values \n",
      "[Step 10/11] Measuring performance (Start inference asynchronously, 1 inference requests, limits: 60000 ms duration)\n",
      "[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).\n",
      "[ INFO ] First inference took 8.79 ms\n",
      "[Step 11/11] Dumping statistics report\n",
      "[ INFO ] Count:            11354 iterations\n",
      "[ INFO ] Duration:         60007.21 ms\n",
      "[ INFO ] Latency:\n",
      "[ INFO ]    Median:        4.57 ms\n",
      "[ INFO ]    Average:       5.16 ms\n",
      "[ INFO ]    Min:           3.18 ms\n",
      "[ INFO ]    Max:           34.87 ms\n",
      "[ INFO ] Throughput:   189.21 FPS\n"
     ]
    }
   ],
   "source": [
    "!benchmark_app -m {model_path} -d GPU -hint latency"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "9432edbe-3078-4534-a0cc-6bfe4bc00b28",
   "metadata": {},
   "source": [
    "#### CPU vs GPU with Throughput Hint\n",
    "[back to top ⬆️](#Table-of-contents:)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "2c97e8db-f97b-466c-9c2e-f660f2084648",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-05-29T16:34:24.690629Z",
     "start_time": "2023-05-29T16:33:24.124456Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[Step 1/11] Parsing and validating input arguments\n",
      "[ INFO ] Parsing input parameters\n",
      "[Step 2/11] Loading OpenVINO Runtime\n",
      "[ INFO ] OpenVINO:\n",
      "[ INFO ] Build ................................. 2022.3.0-9052-9752fafe8eb-releases/2022/3\n",
      "[ INFO ] \n",
      "[ INFO ] Device info:\n",
      "[ INFO ] CPU\n",
      "[ INFO ] Build ................................. 2022.3.0-9052-9752fafe8eb-releases/2022/3\n",
      "[ INFO ] \n",
      "[ INFO ] \n",
      "[Step 3/11] Setting device configuration\n",
      "[Step 4/11] Reading model files\n",
      "[ INFO ] Loading model files\n",
      "[ INFO ] Read model took 29.56 ms\n",
      "[ INFO ] Original model I/O parameters:\n",
      "[ INFO ] Model inputs:\n",
      "[ INFO ]     image_tensor:0 , image_tensor (node: image_tensor) : u8 / [N,H,W,C] / [1,300,300,3]\n",
      "[ INFO ] Model outputs:\n",
      "[ INFO ]     detection_boxes:0 (node: DetectionOutput) : f32 / [...] / [1,1,100,7]\n",
      "[Step 5/11] Resizing model to match image sizes and given batch\n",
      "[ INFO ] Model batch size: 1\n",
      "[Step 6/11] Configuring input of the model\n",
      "[ INFO ] Model inputs:\n",
      "[ INFO ]     image_tensor:0 , image_tensor (node: image_tensor) : u8 / [N,H,W,C] / [1,300,300,3]\n",
      "[ INFO ] Model outputs:\n",
      "[ INFO ]     detection_boxes:0 (node: DetectionOutput) : f32 / [...] / [1,1,100,7]\n",
      "[Step 7/11] Loading the model to the device\n",
      "[ INFO ] Compile model took 158.91 ms\n",
      "[Step 8/11] Querying optimal runtime parameters\n",
      "[ INFO ] Model:\n",
      "[ INFO ]   NETWORK_NAME: frozen_inference_graph\n",
      "[ INFO ]   OPTIMAL_NUMBER_OF_INFER_REQUESTS: 5\n",
      "[ INFO ]   NUM_STREAMS: 5\n",
      "[ INFO ]   AFFINITY: Affinity.CORE\n",
      "[ INFO ]   INFERENCE_NUM_THREADS: 20\n",
      "[ INFO ]   PERF_COUNT: False\n",
      "[ INFO ]   INFERENCE_PRECISION_HINT: <Type: 'float32'>\n",
      "[ INFO ]   PERFORMANCE_HINT: PerformanceMode.THROUGHPUT\n",
      "[ INFO ]   PERFORMANCE_HINT_NUM_REQUESTS: 0\n",
      "[Step 9/11] Creating infer requests and preparing input tensors\n",
      "[ WARNING ] No input files were given for input 'image_tensor'!. This input will be filled with random values!\n",
      "[ INFO ] Fill input 'image_tensor' with random values \n",
      "[Step 10/11] Measuring performance (Start inference asynchronously, 5 inference requests, limits: 60000 ms duration)\n",
      "[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).\n",
      "[ INFO ] First inference took 8.15 ms\n",
      "[Step 11/11] Dumping statistics report\n",
      "[ INFO ] Count:            25240 iterations\n",
      "[ INFO ] Duration:         60010.99 ms\n",
      "[ INFO ] Latency:\n",
      "[ INFO ]    Median:        10.16 ms\n",
      "[ INFO ]    Average:       11.84 ms\n",
      "[ INFO ]    Min:           7.96 ms\n",
      "[ INFO ]    Max:           37.53 ms\n",
      "[ INFO ] Throughput:   420.59 FPS\n"
     ]
    }
   ],
   "source": [
    "!benchmark_app -m {model_path} -d CPU -hint throughput"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "b586a49c-8eb8-4691-abf1-66ecb2b6334c",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-05-29T16:35:28.525784Z",
     "start_time": "2023-05-29T16:34:24.691337Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[Step 1/11] Parsing and validating input arguments\n",
      "[ INFO ] Parsing input parameters\n",
      "[Step 2/11] Loading OpenVINO Runtime\n",
      "[ INFO ] OpenVINO:\n",
      "[ INFO ] Build ................................. 2022.3.0-9052-9752fafe8eb-releases/2022/3\n",
      "[ INFO ] \n",
      "[ INFO ] Device info:\n",
      "[ INFO ] GPU\n",
      "[ INFO ] Build ................................. 2022.3.0-9052-9752fafe8eb-releases/2022/3\n",
      "[ INFO ] \n",
      "[ INFO ] \n",
      "[Step 3/11] Setting device configuration\n",
      "[Step 4/11] Reading model files\n",
      "[ INFO ] Loading model files\n",
      "[ INFO ] Read model took 15.45 ms\n",
      "[ INFO ] Original model I/O parameters:\n",
      "[ INFO ] Model inputs:\n",
      "[ INFO ]     image_tensor , image_tensor:0 (node: image_tensor) : u8 / [N,H,W,C] / [1,300,300,3]\n",
      "[ INFO ] Model outputs:\n",
      "[ INFO ]     detection_boxes:0 (node: DetectionOutput) : f32 / [...] / [1,1,100,7]\n",
      "[Step 5/11] Resizing model to match image sizes and given batch\n",
      "[ INFO ] Model batch size: 1\n",
      "[Step 6/11] Configuring input of the model\n",
      "[ INFO ] Model inputs:\n",
      "[ INFO ]     image_tensor , image_tensor:0 (node: image_tensor) : u8 / [N,H,W,C] / [1,300,300,3]\n",
      "[ INFO ] Model outputs:\n",
      "[ INFO ]     detection_boxes:0 (node: DetectionOutput) : f32 / [...] / [1,1,100,7]\n",
      "[Step 7/11] Loading the model to the device\n",
      "[ INFO ] Compile model took 2249.04 ms\n",
      "[Step 8/11] Querying optimal runtime parameters\n",
      "[ INFO ] Model:\n",
      "[ INFO ]   NETWORK_NAME: frozen_inference_graph\n",
      "[ INFO ]   OPTIMAL_NUMBER_OF_INFER_REQUESTS: 4\n",
      "[ INFO ]   PERF_COUNT: False\n",
      "[ INFO ]   MODEL_PRIORITY: Priority.MEDIUM\n",
      "[ INFO ]   GPU_HOST_TASK_PRIORITY: Priority.MEDIUM\n",
      "[ INFO ]   GPU_QUEUE_PRIORITY: Priority.MEDIUM\n",
      "[ INFO ]   GPU_QUEUE_THROTTLE: Priority.MEDIUM\n",
      "[ INFO ]   GPU_ENABLE_LOOP_UNROLLING: True\n",
      "[ INFO ]   CACHE_DIR: \n",
      "[ INFO ]   PERFORMANCE_HINT: PerformanceMode.THROUGHPUT\n",
      "[ INFO ]   COMPILATION_NUM_THREADS: 20\n",
      "[ INFO ]   NUM_STREAMS: 2\n",
      "[ INFO ]   PERFORMANCE_HINT_NUM_REQUESTS: 0\n",
      "[ INFO ]   INFERENCE_PRECISION_HINT: <Type: 'undefined'>\n",
      "[ INFO ]   DEVICE_ID: 0\n",
      "[Step 9/11] Creating infer requests and preparing input tensors\n",
      "[ WARNING ] No input files were given for input 'image_tensor'!. This input will be filled with random values!\n",
      "[ INFO ] Fill input 'image_tensor' with random values \n",
      "[Step 10/11] Measuring performance (Start inference asynchronously, 4 inference requests, limits: 60000 ms duration)\n",
      "[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).\n",
      "[ INFO ] First inference took 9.17 ms\n",
      "[Step 11/11] Dumping statistics report\n",
      "[ INFO ] Count:            19588 iterations\n",
      "[ INFO ] Duration:         60023.47 ms\n",
      "[ INFO ] Latency:\n",
      "[ INFO ]    Median:        11.31 ms\n",
      "[ INFO ]    Average:       12.15 ms\n",
      "[ INFO ]    Min:           9.26 ms\n",
      "[ INFO ]    Max:           36.04 ms\n",
      "[ INFO ] Throughput:   326.34 FPS\n"
     ]
    }
   ],
   "source": [
    "!benchmark_app -m {model_path} -d GPU -hint throughput"
   ]
  },
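  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "7c4d9e2f-1a6b-4f3c-8d5e-2b9a0c1d3e4f",
   "metadata": {},
   "source": [
    "To see how the reported numbers relate to each other, the sketch below recomputes the throughput of the two recorded runs above in two ways. The first divides the iteration count by the measured duration (the logs report a batch size of 1). The second estimates throughput from the number of parallel inference requests and the average latency, which is an application of Little's law added here as an interpretation and not something `benchmark_app` states. The function names are hypothetical.\n",
    "\n",
    "```python\n",
    "def throughput_from_counts(iterations, duration_ms, batch_size=1):\n",
    "    # Frames processed per second over the whole measurement window.\n",
    "    return iterations * batch_size / (duration_ms / 1000.0)\n",
    "\n",
    "\n",
    "def throughput_from_latency(requests_in_flight, average_latency_ms):\n",
    "    # Little's law estimate: concurrency divided by the time per request.\n",
    "    return requests_in_flight / (average_latency_ms / 1000.0)\n",
    "\n",
    "\n",
    "# Figures copied from the recorded throughput-hint runs above.\n",
    "runs = {\n",
    "    'CPU': {'iterations': 25240, 'duration_ms': 60010.99, 'requests': 5, 'average_ms': 11.84},\n",
    "    'GPU': {'iterations': 19588, 'duration_ms': 60023.47, 'requests': 4, 'average_ms': 12.15},\n",
    "}\n",
    "\n",
    "for device, run in runs.items():\n",
    "    measured = throughput_from_counts(run['iterations'], run['duration_ms'])\n",
    "    estimated = throughput_from_latency(run['requests'], run['average_ms'])\n",
    "    print(f'{device}: {measured:.2f} FPS from counts, {estimated:.2f} FPS from latency')\n",
    "```\n",
    "\n",
    "The count-based value should match the `Throughput` line of each report, while the latency-based estimate is only approximate because it ignores scheduling overhead between requests."
   ]
  },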
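  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "bd8e3e87-bdbf-4424-85aa-28c9b0c144f9",
   "metadata": {},
   "source": [
    "#### Single GPU vs Multiple GPUs\n",
    "[back to top ⬆️](#Table-of-contents:)\n",
    "\n",
    "The commands below target a second GPU (`GPU.1`). On a system with a single GPU, such as the one used for the recorded outputs, they stop with the error `Config for device with 1 ID is not registered in GPU plugin`."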
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "id": "778ebb5f-4e97-42d1-bb64-69dcd13833e8",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-05-29T16:35:30.022537Z",
     "start_time": "2023-05-29T16:35:28.528150Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[Step 1/11] Parsing and validating input arguments\n",
      "[ INFO ] Parsing input parameters\n",
      "[Step 2/11] Loading OpenVINO Runtime\n",
      "[ INFO ] OpenVINO:\n",
      "[ INFO ] Build ................................. 2022.3.0-9052-9752fafe8eb-releases/2022/3\n",
      "[ INFO ] \n",
      "[ INFO ] Device info:\n",
      "[ INFO ] GPU\n",
      "[ INFO ] Build ................................. 2022.3.0-9052-9752fafe8eb-releases/2022/3\n",
      "[ INFO ] \n",
      "[ INFO ] \n",
      "[Step 3/11] Setting device configuration\n",
      "[ WARNING ] Device GPU.1 does not support performance hint property(-hint).\n",
      "[ ERROR ] Config for device with 1 ID is not registered in GPU plugin\n",
      "Traceback (most recent call last):\n",
      "  File \"/home/adrian/repos/openvino_notebooks/venv/lib/python3.9/site-packages/openvino/tools/benchmark/main.py\", line 329, in main\n",
      "    benchmark.set_config(config)\n",
      "  File \"/home/adrian/repos/openvino_notebooks/venv/lib/python3.9/site-packages/openvino/tools/benchmark/benchmark.py\", line 57, in set_config\n",
      "    self.core.set_property(device, config[device])\n",
      "RuntimeError: Config for device with 1 ID is not registered in GPU plugin\n"
     ]
    }
   ],
   "source": [
    "!benchmark_app -m {model_path} -d GPU.1 -hint throughput"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "id": "af4d5eac-55e9-4850-a191-af96e56855e4",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-05-29T16:35:30.439453Z",
     "start_time": "2023-05-29T16:35:30.035612Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[Step 1/11] Parsing and validating input arguments\n",
      "[ INFO ] Parsing input parameters\n",
      "[Step 2/11] Loading OpenVINO Runtime\n",
      "[ INFO ] OpenVINO:\n",
      "[ INFO ] Build ................................. 2022.3.0-9052-9752fafe8eb-releases/2022/3\n",
      "[ INFO ] \n",
      "[ INFO ] Device info:\n",
      "[ INFO ] AUTO\n",
      "[ INFO ] Build ................................. 2022.3.0-9052-9752fafe8eb-releases/2022/3\n",
      "[ INFO ] GPU\n",
      "[ INFO ] Build ................................. 2022.3.0-9052-9752fafe8eb-releases/2022/3\n",
      "[ INFO ] \n",
      "[ INFO ] \n",
      "[Step 3/11] Setting device configuration\n",
      "[ WARNING ] Device GPU.1 does not support performance hint property(-hint).\n",
      "[Step 4/11] Reading model files\n",
      "[ INFO ] Loading model files\n",
      "[ INFO ] Read model took 26.66 ms\n",
      "[ INFO ] Original model I/O parameters:\n",
      "[ INFO ] Model inputs:\n",
      "[ INFO ]     image_tensor , image_tensor:0 (node: image_tensor) : u8 / [N,H,W,C] / [1,300,300,3]\n",
      "[ INFO ] Model outputs:\n",
      "[ INFO ]     detection_boxes:0 (node: DetectionOutput) : f32 / [...] / [1,1,100,7]\n",
      "[Step 5/11] Resizing model to match image sizes and given batch\n",
      "[ INFO ] Model batch size: 1\n",
      "[Step 6/11] Configuring input of the model\n",
      "[ INFO ] Model inputs:\n",
      "[ INFO ]     image_tensor , image_tensor:0 (node: image_tensor) : u8 / [N,H,W,C] / [1,300,300,3]\n",
      "[ INFO ] Model outputs:\n",
      "[ INFO ]     detection_boxes:0 (node: DetectionOutput) : f32 / [...] / [1,1,100,7]\n",
      "[Step 7/11] Loading the model to the device\n",
      "[ ERROR ] Config for device with 1 ID is not registered in GPU plugin\n",
      "Traceback (most recent call last):\n",
      "  File \"/home/adrian/repos/openvino_notebooks/venv/lib/python3.9/site-packages/openvino/tools/benchmark/main.py\", line 414, in main\n",
      "    compiled_model = benchmark.core.compile_model(model, benchmark.device)\n",
      "  File \"/home/adrian/repos/openvino_notebooks/venv/lib/python3.9/site-packages/openvino/runtime/ie_api.py\", line 399, in compile_model\n",
      "    super().compile_model(model, device_name, {} if config is None else config),\n",
      "RuntimeError: Config for device with 1 ID is not registered in GPU plugin\n"
     ]
    }
   ],
   "source": [
    "!benchmark_app -m {model_path} -d AUTO:GPU.1,GPU.0 -hint cumulative_throughput"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "id": "85db3ae8-8c98-46a3-82fa-ab9539c96a8c",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-05-29T16:35:30.832687Z",
     "start_time": "2023-05-29T16:35:30.441528Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[Step 1/11] Parsing and validating input arguments\n",
      "[ INFO ] Parsing input parameters\n",
      "[Step 2/11] Loading OpenVINO Runtime\n",
      "[ INFO ] OpenVINO:\n",
      "[ INFO ] Build ................................. 2022.3.0-9052-9752fafe8eb-releases/2022/3\n",
      "[ INFO ] \n",
      "[ INFO ] Device info:\n",
      "[ INFO ] GPU\n",
      "[ INFO ] Build ................................. 2022.3.0-9052-9752fafe8eb-releases/2022/3\n",
      "[ INFO ] MULTI\n",
      "[ INFO ] Build ................................. 2022.3.0-9052-9752fafe8eb-releases/2022/3\n",
      "[ INFO ] \n",
      "[ INFO ] \n",
      "[Step 3/11] Setting device configuration\n",
      "[ WARNING ] Device GPU.1 does not support performance hint property(-hint).\n",
      "[Step 4/11] Reading model files\n",
      "[ INFO ] Loading model files\n",
      "[ INFO ] Read model took 14.84 ms\n",
      "[ INFO ] Original model I/O parameters:\n",
      "[ INFO ] Model inputs:\n",
      "[ INFO ]     image_tensor:0 , image_tensor (node: image_tensor) : u8 / [N,H,W,C] / [1,300,300,3]\n",
      "[ INFO ] Model outputs:\n",
      "[ INFO ]     detection_boxes:0 (node: DetectionOutput) : f32 / [...] / [1,1,100,7]\n",
      "[Step 5/11] Resizing model to match image sizes and given batch\n",
      "[ INFO ] Model batch size: 1\n",
      "[Step 6/11] Configuring input of the model\n",
      "[ INFO ] Model inputs:\n",
      "[ INFO ]     image_tensor:0 , image_tensor (node: image_tensor) : u8 / [N,H,W,C] / [1,300,300,3]\n",
      "[ INFO ] Model outputs:\n",
      "[ INFO ]     detection_boxes:0 (node: DetectionOutput) : f32 / [...] / [1,1,100,7]\n",
      "[Step 7/11] Loading the model to the device\n",
      "[ ERROR ] Config for device with 1 ID is not registered in GPU plugin\n",
      "Traceback (most recent call last):\n",
      "  File \"/home/adrian/repos/openvino_notebooks/venv/lib/python3.9/site-packages/openvino/tools/benchmark/main.py\", line 414, in main\n",
      "    compiled_model = benchmark.core.compile_model(model, benchmark.device)\n",
      "  File \"/home/adrian/repos/openvino_notebooks/venv/lib/python3.9/site-packages/openvino/runtime/ie_api.py\", line 399, in compile_model\n",
      "    super().compile_model(model, device_name, {} if config is None else config),\n",
      "RuntimeError: Config for device with 1 ID is not registered in GPU plugin\n"
     ]
    }
   ],
   "source": [
    "!benchmark_app -m {model_path} -d MULTI:GPU.1,GPU.0 -hint throughput"
   ]
  },
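  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "9e5f1a3b-6c7d-4a2e-b8f0-4d3c2b1a0f9e",
   "metadata": {},
   "source": [
    "One way to avoid requesting a device that is not present is to build the device string from the list returned by `core.available_devices`. The following is a minimal sketch under that assumption; `pick_gpu_target` is a hypothetical helper, and falling back to `CPU` when no GPU is found is a choice made for this illustration.\n",
    "\n",
    "```python\n",
    "def pick_gpu_target(available_devices, mode='MULTI'):\n",
    "    # Keep only GPU entries, e.g. 'GPU' or 'GPU.0', 'GPU.1'.\n",
    "    gpus = [d for d in available_devices if d == 'GPU' or d.startswith('GPU.')]\n",
    "    if not gpus:\n",
    "        return 'CPU'  # no GPU present, fall back to the CPU\n",
    "    if len(gpus) == 1:\n",
    "        return gpus[0]  # a single GPU needs no MULTI or AUTO wrapper\n",
    "    return mode + ':' + ','.join(gpus)\n",
    "\n",
    "\n",
    "print(pick_gpu_target(['CPU', 'GPU']))\n",
    "print(pick_gpu_target(['CPU', 'GPU.0', 'GPU.1']))\n",
    "print(pick_gpu_target(['CPU', 'GPU.0', 'GPU.1'], mode='AUTO'))\n",
    "```\n",
    "\n",
    "The returned string could then be passed to the `-d` argument of `benchmark_app`, for example after calling `pick_gpu_target(core.available_devices)`."
   ]
  },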
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "d3fd1496-996d-49d8-80b9-3bfb3dda9729",
   "metadata": {
    "tags": []
   },
   "source": [
    "## Basic Application Using GPUs\n",
    "[back to top ⬆️](#Table-of-contents:)\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "53a25247-a69f-40a6-be52-ea6de89bf470",
   "metadata": {},
   "source": [
    "We will now show an end-to-end object detection example using GPUs in OpenVINO. The application compiles a model on GPU with the \"THROUGHPUT\" hint, then loads a video and preprocesses each frame to match the input shape expected by the model. Once the frames are loaded, it sets up an asynchronous pipeline, performs inference, and saves the detections found in each frame. The detections are then drawn on their corresponding frames and saved as a video, which is displayed at the end of the application."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "e1686090-2a2e-4ae6-8f22-7ef458ce221b",
   "metadata": {},
   "source": [
    "### Import Necessary Packages\n",
    "[back to top ⬆️](#Table-of-contents:)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "id": "55e0b948-ec1f-4ab8-9418-103dc8f06c19",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-05-29T16:35:30.841379Z",
     "start_time": "2023-05-29T16:35:30.838113Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['CPU', 'GPU']"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import time\n",
    "from pathlib import Path\n",
    "\n",
    "import cv2\n",
    "import numpy as np\n",
    "from IPython.display import Video\n",
    "import openvino as ov\n",
    "import openvino.properties.hint as hints\n",
    "\n",
    "# Instantiate OpenVINO Runtime\n",
    "core = ov.Core()\n",
    "core.available_devices"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "eadfb30e-d6d3-4ed3-8e7e-f1a843ce2ec0",
   "metadata": {},
   "source": [
    "### Compile the Model\n",
    "[back to top ⬆️](#Table-of-contents:)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "id": "8648ae4f-2c73-4525-ace0-1c71afb18ec4",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-05-29T16:35:33.052928Z",
     "start_time": "2023-05-29T16:35:30.843197Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Model input shape: 1 300 300 3\n"
     ]
    }
   ],
   "source": [
    "# Read model and compile it on GPU in THROUGHPUT mode\n",
    "model = core.read_model(model=ov_model_path)\n",
    "device_name = \"GPU\"\n",
    "compiled_model = core.compile_model(model=model, device_name=device_name, config={hints.performance_mode(): hints.PerformanceMode.THROUGHPUT})\n",
    "\n",
    "# Get the input and output nodes\n",
    "input_layer = compiled_model.input(0)\n",
    "output_layer = compiled_model.output(0)\n",
    "\n",
    "# Get the input size\n",
    "num, height, width, channels = input_layer.shape\n",
    "print(\"Model input shape:\", num, height, width, channels)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "083bce3d-647d-46d1-8540-4a2935bc829f",
   "metadata": {},
   "source": [
    "### Load and Preprocess Video Frames\n",
    "[back to top ⬆️](#Table-of-contents:)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "id": "fcee85d9-eab1-4a94-bb2f-d5217fd7cdcf",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-05-29T16:35:33.502639Z",
     "start_time": "2023-05-29T16:35:33.054994Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Loading video...\n",
      "Video loaded!\n",
      "Frame shape:  (1, 300, 300, 3)\n",
      "Number of frames:  288\n"
     ]
    }
   ],
   "source": [
    "# Load video\n",
    "video_file = \"https://storage.openvinotoolkit.org/repositories/openvino_notebooks/data/data/video/Coco%20Walking%20in%20Berkeley.mp4\"\n",
    "video = cv2.VideoCapture(video_file)\n",
    "framebuf = []\n",
    "\n",
    "# Go through every frame of video and resize it\n",
    "print(\"Loading video...\")\n",
    "while video.isOpened():\n",
    "    ret, frame = video.read()\n",
    "    if not ret:\n",
    "        print(\"Video loaded!\")\n",
    "        video.release()\n",
    "        break\n",
    "\n",
    "    # Preprocess frames - convert them to shape expected by model\n",
    "    input_frame = cv2.resize(src=frame, dsize=(width, height), interpolation=cv2.INTER_AREA)\n",
    "    input_frame = np.expand_dims(input_frame, axis=0)\n",
    "\n",
    "    # Append frame to framebuffer\n",
    "    framebuf.append(input_frame)\n",
    "\n",
    "\n",
    "print(\"Frame shape: \", framebuf[0].shape)\n",
    "print(\"Number of frames: \", len(framebuf))\n",
    "\n",
    "# Show original video file\n",
    "# If the video does not display correctly inside the notebook, please open it with your favorite media player\n",
    "Video(video_file)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "7003286a-8349-46f6-9d38-c6bb959e892a",
   "metadata": {},
   "source": [
    "### Define Model Output Classes\n",
    "[back to top ⬆️](#Table-of-contents:)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "id": "621b919e-c120-42f9-89eb-ffd7b810c91f",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-05-29T16:35:33.509992Z",
     "start_time": "2023-05-29T16:35:33.506106Z"
    }
   },
   "outputs": [],
   "source": [
    "# Define the model's labelmap (this model uses COCO classes)\n",
    "classes = [\n",
    "    \"background\",\n",
    "    \"person\",\n",
    "    \"bicycle\",\n",
    "    \"car\",\n",
    "    \"motorcycle\",\n",
    "    \"airplane\",\n",
    "    \"bus\",\n",
    "    \"train\",\n",
    "    \"truck\",\n",
    "    \"boat\",\n",
    "    \"traffic light\",\n",
    "    \"fire hydrant\",\n",
    "    \"street sign\",\n",
    "    \"stop sign\",\n",
    "    \"parking meter\",\n",
    "    \"bench\",\n",
    "    \"bird\",\n",
    "    \"cat\",\n",
    "    \"dog\",\n",
    "    \"horse\",\n",
    "    \"sheep\",\n",
    "    \"cow\",\n",
    "    \"elephant\",\n",
    "    \"bear\",\n",
    "    \"zebra\",\n",
    "    \"giraffe\",\n",
    "    \"hat\",\n",
    "    \"backpack\",\n",
    "    \"umbrella\",\n",
    "    \"shoe\",\n",
    "    \"eye glasses\",\n",
    "    \"handbag\",\n",
    "    \"tie\",\n",
    "    \"suitcase\",\n",
    "    \"frisbee\",\n",
    "    \"skis\",\n",
    "    \"snowboard\",\n",
    "    \"sports ball\",\n",
    "    \"kite\",\n",
    "    \"baseball bat\",\n",
    "    \"baseball glove\",\n",
    "    \"skateboard\",\n",
    "    \"surfboard\",\n",
    "    \"tennis racket\",\n",
    "    \"bottle\",\n",
    "    \"plate\",\n",
    "    \"wine glass\",\n",
    "    \"cup\",\n",
    "    \"fork\",\n",
    "    \"knife\",\n",
    "    \"spoon\",\n",
    "    \"bowl\",\n",
    "    \"banana\",\n",
    "    \"apple\",\n",
    "    \"sandwich\",\n",
    "    \"orange\",\n",
    "    \"broccoli\",\n",
    "    \"carrot\",\n",
    "    \"hot dog\",\n",
    "    \"pizza\",\n",
    "    \"donut\",\n",
    "    \"cake\",\n",
    "    \"chair\",\n",
    "    \"couch\",\n",
    "    \"potted plant\",\n",
    "    \"bed\",\n",
    "    \"mirror\",\n",
    "    \"dining table\",\n",
    "    \"window\",\n",
    "    \"desk\",\n",
    "    \"toilet\",\n",
    "    \"door\",\n",
    "    \"tv\",\n",
    "    \"laptop\",\n",
    "    \"mouse\",\n",
    "    \"remote\",\n",
    "    \"keyboard\",\n",
    "    \"cell phone\",\n",
    "    \"microwave\",\n",
    "    \"oven\",\n",
    "    \"toaster\",\n",
    "    \"sink\",\n",
    "    \"refrigerator\",\n",
    "    \"blender\",\n",
    "    \"book\",\n",
    "    \"clock\",\n",
    "    \"vase\",\n",
    "    \"scissors\",\n",
    "    \"teddy bear\",\n",
    "    \"hair drier\",\n",
    "    \"toothbrush\",\n",
    "    \"hair brush\",\n",
    "]"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "0ad10b97-e32e-4885-af1d-0636e970b707",
   "metadata": {
    "tags": []
   },
   "source": [
    "### Set up Asynchronous Pipeline\n",
    "[back to top ⬆️](#Table-of-contents:)\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "91f6ed58-8b3f-415e-a733-a7517e200b12",
   "metadata": {},
   "source": [
    "#### Callback Definition\n",
    "[back to top ⬆️](#Table-of-contents:)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "id": "1268e1a4-b530-4b39-b79d-d6d5f988a077",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-05-29T16:35:33.513820Z",
     "start_time": "2023-05-29T16:35:33.510248Z"
    }
   },
   "outputs": [],
   "source": [
    "# Define a callback function that runs every time the asynchronous pipeline completes inference on a frame\n",
    "def completion_callback(infer_request: ov.InferRequest, frame_id: int) -> None:\n",
    "    global frame_number\n",
    "    stop_time = time.time()\n",
    "    frame_number += 1\n",
    "\n",
    "    predictions = next(iter(infer_request.results.values()))\n",
    "    results[frame_id] = predictions[:10]  # Grab first 10 predictions for this frame\n",
    "\n",
    "    total_time = stop_time - start_time\n",
    "    frame_fps[frame_id] = frame_number / total_time"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "2d2f6aba-b3ba-478b-b02f-6b438b575c33",
   "metadata": {},
   "source": [
    "#### Create Async Pipeline\n",
    "[back to top ⬆️](#Table-of-contents:)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "id": "56e8eae2-20a9-43dc-8774-444ad0179136",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-05-29T16:35:33.521320Z",
     "start_time": "2023-05-29T16:35:33.511945Z"
    }
   },
   "outputs": [],
   "source": [
    "# Create asynchronous inference queue with optimal number of infer requests\n",
    "infer_queue = ov.AsyncInferQueue(compiled_model)\n",
    "infer_queue.set_callback(completion_callback)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "fb47ae59-b86d-46c4-b219-e7beccd5f796",
   "metadata": {},
   "source": [
    "### Perform Inference\n",
    "[back to top ⬆️](#Table-of-contents:)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "id": "59f493dd-4cb7-4ae2-b117-7ba41a950d86",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-05-29T16:35:34.893571Z",
     "start_time": "2023-05-29T16:35:33.519906Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Total time to infer all frames: 1.366s\n",
      "Time per frame: 0.004744s (210.774 FPS)\n"
     ]
    }
   ],
   "source": [
    "# Perform inference on every frame in the framebuffer\n",
    "results = {}\n",
    "frame_fps = {}\n",
    "frame_number = 0\n",
    "start_time = time.time()\n",
    "for i, input_frame in enumerate(framebuf):\n",
    "    infer_queue.start_async({0: input_frame}, i)\n",
    "\n",
    "infer_queue.wait_all()  # Wait until all inference requests in the AsyncInferQueue are completed\n",
    "stop_time = time.time()\n",
    "\n",
    "# Calculate total inference time and FPS\n",
    "total_time = stop_time - start_time\n",
    "fps = len(framebuf) / total_time\n",
    "time_per_frame = 1 / fps\n",
    "print(f\"Total time to infer all frames: {total_time:.3f}s\")\n",
    "print(f\"Time per frame: {time_per_frame:.6f}s ({fps:.3f} FPS)\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "7fd90b68-ae14-4fe9-98c7-e8d23146c82c",
   "metadata": {},
   "source": [
    "### Process Results\n",
    "[back to top ⬆️](#Table-of-contents:)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "id": "1ccba843-e40b-43a8-b2b4-8ebd67bc1424",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-05-29T16:40:11.393739Z",
     "start_time": "2023-05-29T16:40:11.204923Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Video loaded!\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<video controls  width=\"800\" >\n",
       " <source src=\"data:None;base64,output/output.mp4\" type=\"None\">\n",
       " Your browser does not support the video tag.\n",
       " </video>"
      ],
      "text/plain": [
       "<IPython.core.display.Video object>"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Set minimum detection threshold\n",
    "min_thresh = 0.6\n",
    "\n",
    "# Load video\n",
    "video = cv2.VideoCapture(video_file)\n",
    "\n",
    "# Get video parameters\n",
    "frame_width = int(video.get(cv2.CAP_PROP_FRAME_WIDTH))\n",
    "frame_height = int(video.get(cv2.CAP_PROP_FRAME_HEIGHT))\n",
    "fps = int(video.get(cv2.CAP_PROP_FPS))\n",
    "fourcc = int(video.get(cv2.CAP_PROP_FOURCC))\n",
    "\n",
    "# Create folder and VideoWriter to save output video\n",
    "Path(\"./output\").mkdir(exist_ok=True)\n",
    "output = cv2.VideoWriter(\"output/output.mp4\", fourcc, fps, (frame_width, frame_height))\n",
    "\n",
    "# Draw detection results on every frame of video and save as a new video file\n",
    "while video.isOpened():\n",
    "    current_frame = int(video.get(cv2.CAP_PROP_POS_FRAMES))\n",
    "    ret, frame = video.read()\n",
    "    if not ret:\n",
    "        print(\"Video loaded!\")\n",
    "        output.release()\n",
    "        video.release()\n",
    "        break\n",
    "\n",
    "    # Draw info at the top left such as current fps, the devices and the performance hint being used\n",
    "    cv2.putText(\n",
    "        frame,\n",
    "        f\"fps {str(round(frame_fps[current_frame], 2))}\",\n",
    "        (5, 20),\n",
    "        cv2.FONT_ITALIC,\n",
    "        0.6,\n",
    "        (0, 0, 0),\n",
    "        1,\n",
    "        cv2.LINE_AA,\n",
    "    )\n",
    "    cv2.putText(\n",
    "        frame,\n",
    "        f\"device {device_name}\",\n",
    "        (5, 40),\n",
    "        cv2.FONT_ITALIC,\n",
    "        0.6,\n",
    "        (0, 0, 0),\n",
    "        1,\n",
    "        cv2.LINE_AA,\n",
    "    )\n",
    "    cv2.putText(\n",
    "        frame,\n",
    "        f\"hint {compiled_model.get_property(hints.performance_mode)}\",\n",
    "        (5, 60),\n",
    "        cv2.FONT_ITALIC,\n",
    "        0.6,\n",
    "        (0, 0, 0),\n",
    "        1,\n",
    "        cv2.LINE_AA,\n",
    "    )\n",
    "\n",
    "    # prediction contains [image_id, label, conf, x_min, y_min, x_max, y_max] according to model\n",
    "    for prediction in np.squeeze(results[current_frame]):\n",
    "        if prediction[2] > min_thresh:\n",
    "            x_min = int(prediction[3] * frame_width)\n",
    "            y_min = int(prediction[4] * frame_height)\n",
    "            x_max = int(prediction[5] * frame_width)\n",
    "            y_max = int(prediction[6] * frame_height)\n",
    "            label = classes[int(prediction[1])]\n",
    "\n",
    "            # Draw a bounding box with its label above it\n",
    "            cv2.rectangle(frame, (x_min, y_min), (x_max, y_max), (0, 255, 0), 1, cv2.LINE_AA)\n",
    "            cv2.putText(\n",
    "                frame,\n",
    "                label,\n",
    "                (x_min, y_min - 10),\n",
    "                cv2.FONT_ITALIC,\n",
    "                1,\n",
    "                (255, 0, 0),\n",
    "                1,\n",
    "                cv2.LINE_AA,\n",
    "            )\n",
    "\n",
    "    output.write(frame)\n",
    "\n",
    "# Show output video file\n",
    "# If the video does not display correctly inside the notebook, please open it with your favorite media player\n",
    "Video(\"output/output.mp4\", width=800, embed=True)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "58fb6bb6-0d1c-4b0b-8a0b-efb5667816ea",
   "metadata": {
    "tags": []
   },
   "source": [
    "## Conclusion\n",
    "[back to top ⬆️](#Table-of-contents:)\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "dbbb4f28-2917-45ca-8fa0-8c670c0348bc",
   "metadata": {},
   "source": [
    "This tutorial demonstrates how easy it is to use one or more GPUs in OpenVINO, check their properties, and even tailor the model performance through the different performance hints. It also provides a walk-through of a basic object detection application that uses a GPU and displays the detected bounding boxes.\n",
    "\n",
    "To read more about any of these topics, feel free to visit their corresponding documentation:\n",
    "\n",
    "* [GPU Plugin](https://docs.openvino.ai/2024/openvino-workflow/running-inference/inference-devices-and-modes/gpu-device.html)\n",
    "* [AUTO Plugin](https://docs.openvino.ai/2024/openvino-workflow/running-inference/inference-devices-and-modes/auto-device-selection.html)\n",
    "* [Model Caching](https://docs.openvino.ai/2024/openvino-workflow/running-inference/optimize-inference/optimizing-latency/model-caching-overview.html)\n",
    "* [MULTI Device Mode](https://docs.openvino.ai/2024/openvino-workflow/running-inference/inference-devices-and-modes/multi-device.html)\n",
    "* [Query Device Properties](https://docs.openvino.ai/2024/openvino-workflow/running-inference/inference-devices-and-modes/query-device-properties.html)\n",
    "* [Configurations for GPUs with OpenVINO](https://docs.openvino.ai/2024/get-started/configurations/configurations-intel-gpu.html)\n",
    "* [Benchmark Python Tool](https://docs.openvino.ai/2024/learn-openvino/openvino-samples/benchmark-tool.html)\n",
    "* [Asynchronous Inferencing](https://docs.openvino.ai/2024/documentation/openvino-extensibility/openvino-plugin-library/asynch-inference-request.html)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.4"
  },
  "openvino_notebooks": {
   "imageUrl": "",
   "tags": {
    "categories": [
     "API Overview"
    ],
    "libraries": [],
    "other": [],
    "tasks": [
     "Object Detection"
    ]
   }
  },
  "vscode": {
   "interpreter": {
    "hash": "cec18e25feb9469b5ff1085a8097bdcd86db6a4ac301d6aeff87d0f3e7ce4ca5"
   }
  },
  "widgets": {
   "application/vnd.jupyter.widget-state+json": {
    "state": {},
    "version_major": 2,
    "version_minor": 0
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
